Abstract: Artificial agents are becoming artificial companions that
interact with the user on a long-term basis. This evolution has
brought new challenges to the affective computing domain, such
as designing artificial agents whose personalities benefit the
user. Endowing artificial agents with personality could help
to increase the agent’s believability, hence easing the interaction.
This paper touches on two questions pertaining to computational
personality modeling: 1/ how to produce artificial personalities
that can inform personality researchers, whether from computer
science or psychology, and 2/ whether the behaviors produced by
artificial agents will be perceived by users as conveying the
programmed personality. We propose to use a data-driven
approach to endow artificial agents with personality, using the
regulatory focus theory as a framework. We used machine-learned
game strategies, in the form of alternating decision
trees computed from human data, to convey the personality of
artificial agents. We then tested whether these personalities can
be perceived by users after playing a game against these agents.
We used two artificial agents as controls: one playing randomly
and one with an “average / depersonalized” strategy. On the
one hand, our results show that agents’ regulatory focus, when
programmed, can be accurately perceived by users. On the other
hand, our results also point out that personality will be perceived
by users even if the agent’s design does not intend to transmit
one.


I. INTRODUCTION 


In the last decade, software agents were brought to a new
level by technological evolutions: artificial agents ceased
to be only human-computer interfaces and became artificial
companions [1]. On the one hand, an artificial companion can
be defined as “a personalised, multi-modal, helpful, collaborative,
conversational, learning, social, emotional, cognitive and
persistent computer agent that knows its owner, interacts with
the user over a long period of time and builds a (long-term)
relationship to the user” [2]. On the other hand, personality
can be defined as a coherent patterning of affect, behavior,
cognition, and desires (goals) over time and space [3]. Knowing
that credibility (i.e. the capacity of being perceived as
believable and convincing [4]) is assessed by the consistency
and the coherence of an artificial entity at various levels
(psychological and physical; intrapersonal as well as social [5],
[6]), endowing artificial companions with personality could
help to increase the companion’s believability, hence easing the
interaction and, thereby, producing an adequate environment
for a “relationship” to take place.

Now the question is: how to endow artificial entities with
personality? Following Vinciarelli and Mohammadi [7], we
like to think of personality as “a common ground where
multiple disciplines, including computing and psychology, can
contribute and mutually benefit from each other: progress in
personality theory should help to build more effective personality
machines and vice versa”. Turning to personality
psychology, computer scientists have found two principal types
of models: trait models and socio-cognitive models. Both
have been used in affective computing, the former more than
the latter. Yet socio-cognitive models, which focus on the
structures and processes of personality, provide an interesting
background for computer scientists. With interpretable computational
models relying on socio-cognitive theories, affective
computing can inform psychologists about the “validity” of these
theories or, at least, their capacity to represent a process resulting
in consistent behaviors. While implementing artificial personalities
is an open research question, perceiving personality should not
be an issue for the user: people can find personality in many
things, from moving geometrical shapes [8] to their own cars
[9]. But can users perceive the personality that programmers
intended for their artificial companions?

To try to answer these questions, we adopt the following
process: 1/ run an experiment to collect human data; 2/ use a
classifier to determine the important features guiding human
behaviors; 3/ design artificial agents based on the result of
the classification; 4/ set up a user study to verify the users’
perception of agents’ personalities. Most of the works in
affective computing are theory-driven. Theory-driven models
transform the (mostly qualitative) knowledge provided by
psychological models into an implementation, symbolic or
not, by a deductive logic. The issue lies in the gap between
the constraints of implementation and the generality of the
psychological models used. That is why we propose to take
a data-driven approach, i.e. our implementation will try to
produce behaviors as close as possible to empirical data by
using classification methods that produce interpretable
outputs. That way, it may be possible to use an inductive logic
to infer more theoretical mechanisms, and our results could
in turn feed future symbolic implementations.

In this article, we start with related works in the field of
psychology and affective computing (Section II). We then
present our data-driven modeling of personality, based
on the regulatory focus theory, for an artificial agent playing a
board game with a user (Section III), along with the user study
about perception of the agent’s personality and credibility
(Section III-D). Then, we present and discuss the results of
this study (Section IV) and finally, we conclude and propose
future directions (Section V).


II. RELATED WORK 
A. Personality in psychology and affective computing 


In 2000, Nass and Moon [10] suggested the Computers As
Social Actors (CASA) paradigm. The CASA paradigm states
that people tend to adopt social attitudes with machines
that can elicit social heuristics. Personality can be attributed
by users to a computer and have an influence over users’
behaviors. In this view, designing specific personalities that
are perceived as they were designed seems especially
important for any computer scientist taking an interest in artificial
personality. Although computer scientists sometimes devise their own
model of personality [11], most of the works in affective
computing lean on psychological models, such as the Five
Factor Model (FFM) [12], which defines five traits of personality.
Because the FFM is a dominant and well-known model in
personality psychology, it is naturally becoming the
most used reference in affective computing [13]. The majority
of FFM-inspired models use a symbolic implementation but
some propose a network approach, such as [14], which creates a
neural network based on the 240 items of the NEO-PI-R (an FFM
questionnaire for assessing traits of personality) and suggests
that the stability of personality comes from the stability of this
network structure. But the FFM and other trait theories propose a
descriptive approach to personality. Thus, this kind of theory
helps to grasp “what” personality is but cannot guide computer
scientists on the “how”: how to link behaviors to personality.
In contrast, socio-cognitive models are explanatory per se.

The socio-cognitive approach to personality attempts to understand
the cognitive and social processes that lead to personality
and underlines the importance of the situation in exhibiting
personality behaviors [15]. For that purpose, it focuses on
the interaction between the person and the social context
and highlights the intra-individual differences [16]. There
are few works taking a socio-cognitive approach in affective
computing. Among them, it is not uncommon to find machine-learning
mechanisms and especially neural network models:
the SPOT (Simulating Personality Over Time) model combines
the five traits of the FFM with situational factors inside a neural
network which determines the output behaviors of virtual agents
[17]; the BIS/BAS theory (Behavioral Inhibition System /
Behavioral Approach System) has been operationalized in
a neural network model of personality which links situational
features, resources and motivations [18]. After training,
simulations showed this network’s ability to produce stable
behavioral signatures (i.e. consistent associations between
situations and behaviors). Nonetheless, neural network approaches
have a major drawback: unlike symbolic implementations, they
lack explanatory power when trying to understand personality
processes [19].


B. Using the regulatory focus theory 


To try to answer our first question - how to endow ar- 
tificial entities with personality? - we propose to take a 
socio-cognitive perspective by using the regulatory focus as 
a framework, as suggested by [20]. The regulatory focus 
theory is of interest here because it provides insights about 
how different personalities can be represented as functions of 
different processes. 

The regulatory focus theory [21] distinguishes between two
self-regulation strategies: 1/ promotion-focus, concerned with
the presence or absence of positive outcomes (gains versus
non-gains) and 2/ prevention-focus, concerned with the presence
or absence of negative outcomes (losses versus non-losses).
According to this theory, promotion-focus people would be
more prone to using their ideal selves as guides for their
behaviors (i.e., they seek to be what they want to be)
than prevention-focus people, who would prefer using their ought
selves (i.e., they seek to be what they think they
have to be). Promotion and prevention are two independent
dimensions: each person has both a promotion-focus and a
prevention-focus score. Regulatory focus can be situational,
i.e. induced by context, but the theory states that people have a
chronic focus, i.e. a “habitual” focus used by default.


C. Conveying personality via game strategies 


As an application, we selected strategy during a game
as the first and only modality for expressing the artificial
agent’s personality. Several links have been made between
personality and games, in psychology ([22], [23]) and in
computer science ([24], [25]). Games are quite relevant for
designing and evaluating affective agents ([26], [27]). We
selected a board game named “Can’t Stop” (designed by Sid
Sackson [28]). At each turn of this game, the player has to
choose between either stopping the turn, i.e. saving the current
gains but losing in speed, or playing again, i.e. taking the
risk of losing the current gains to win more. We selected this
game because it makes it possible to study strategies in terms of
promotion and prevention, since at each turn the player has
to 1/ select a movement that can be more or less risky and
2/ make a choice between a vigilant strategy (stopping) and an
eager strategy (playing again).


III. DESIGNING ARTIFICIAL AGENTS WITH A REGULATORY 
FOCUS PERSONALITY 


We designed artificial agents playing a board game in which
strategy can convey the agent’s focus. In order to have data to
compute machine-learned, data-driven strategies for our agents,
we chose to record human-vs-human game sessions. Together with the
self-assessment of participants’ regulatory focus, the games’
data can be used to learn the strategies adopted by human players
and to test whether these strategies are influenced by the player’s
personality. Our methodology is composed of the following
steps: 1/ learning human strategies by collecting human data
(i.e. recording the actions of humans playing the game and measuring
their regulatory focus) and extracting humans’ strategies by using
a classifier to determine the important features guiding their
behaviors; 2/ designing “regulatory focus endowed” agents
based on this classification, along with control agents “without
personality”; 3/ testing whether users perceive the intended
agents’ personalities in a controlled user study.


A. Learning human strategies 


1) Data collection: Twelve dyads, composed of fifteen
participants (13 men, 2 women; age M = 29,7 years, SD
= 10,2), played Can’t Stop games. Prior to the game session,
each participant had answered the Regulatory Focus Questionnaire
Proverbs Form (RFQ-PF), our own French questionnaire
measuring the strength of the two self-regulatory strategies
with 18 questions to answer on a 7-point Likert scale. A
psychometric study (N = 277) validated the capacity of this
questionnaire to measure chronic regulatory focus (in prep).
Participants played via computers. To mimic a game with
a distant opponent like an artificial agent, participants
saw each other through webcams. They were allowed to
talk to each other during the game.

2) Extracting strategies: To learn the behaviors, we trained
two classifiers, one for each decision of the game: the choice
of a movement and the stop-or-again decision. Features were
related to the global state of the game (e.g. score difference
at the time, number of playable columns), characteristics of
candidate moves (e.g. distance from the top of a column,
absolute and relative to the other possible moves) or statistics
on the whole game (e.g. number of turns since the last loss).
Both classifiers were binary classifiers. For the first classifier
(movement decision), the target class was either “Selected”
or “NotSelected”, and the features described two alternative
moves, including the move selected by the human¹. The dataset was
composed of 1630 instances and 108 features. For the second
classifier (stop-or-again), we considered one instance for each
human player decision, with target class “Continue” if the
player chose to continue and “Stop” otherwise. The dataset
was composed of 670 instances and used 47 features (the
number of features is higher for the first dataset since two
alternative moves are considered).
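
The pairwise construction used for the movement classifier can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the `first_`/`second_` prefixes and the `dist_to_top` feature are hypothetical, and only the idea of emitting one “Selected” and one “NotSelected” instance per (chosen, alternative) pair comes from the text.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def merge_pair(first: Features, second: Features) -> Features:
    """Concatenate the features of two candidate moves into one row."""
    row = {f"first_{name}": value for name, value in first.items()}
    row.update({f"second_{name}": value for name, value in second.items()})
    return row


def pairwise_instances(
    chosen: Features, alternatives: List[Features]
) -> List[Tuple[Features, str]]:
    """Emit two labelled instances for each (chosen, alternative) pair."""
    instances = []
    for alternative in alternatives:
        # Human-selected move in first position: target is "Selected".
        instances.append((merge_pair(chosen, alternative), "Selected"))
        # Same pair with the order swapped: target is "NotSelected".
        instances.append((merge_pair(alternative, chosen), "NotSelected"))
    return instances


# Hypothetical decision: one chosen move and two rejected alternatives.
rows = pairwise_instances(
    {"dist_to_top": 2.0},
    [{"dist_to_top": 5.0}, {"dist_to_top": 7.0}],
)
labels = [label for _, label in rows]
```

By construction the two classes are perfectly balanced, which is consistent with a majority-class baseline that errs about half of the time on this dataset.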

3) Evaluating the classifiers: We chose to use the Alternating
Decision Tree classifier (ADTree) [29] with 10 boosting
iterations because the classifier (learned tree) is easy to
interpret, which was required for our analysis to guide future
symbolic implementation; the quality of the result was similar
to other possible classifiers². To determine the accuracy of
the classifiers, we performed a 10-fold cross-validation and
benchmarked our results against different classifiers, which
ultimately presented performances equivalent to those of the
ADTree classifiers.


¹ For each human movement decision HD and each possible alternative AD,
we added two instances, one with (HD,AD) and target=“Selected” and one
with (AD,HD) and target=“NotSelected”.

² We performed the classification with Weka (version 3.7).


TABLE I
STATISTICS FROM THE 10-FOLD STRATIFIED CROSS-VALIDATION OF
THREE DIFFERENT CLASSIFIERS ON PLAYING MODELS

Movement choice model
                                ZeroR    ADTree   RandomForest
Incorrectly categorized items   50,3%    16,5%    17,2%
Kappa statistic                 0        0,67     0,66
ROC Area                        0,50     0,92     0,91

Stop or again decision model - without personality
                                ZeroR    ADTree   RandomForest
Incorrectly categorized items   22,5%    23,4%    22,1%
Kappa statistic                 0        0,25     0,22
ROC Area                        0,50     0,79     0,77

Stop or again decision model - with personality
                                ZeroR    ADTree   RandomForest
Incorrectly categorized items   22,5%    19,70%   21,8%
Kappa statistic                 0        0,33     0,23
ROC Area                        0,50     0,80     0,76

As examples, we present in Table I the performances of the
Zero-R classifier (which predicts the most frequent class,
regardless of predictors; used as a baseline) and of the
Random Forest classifier (with 10 trees; known for good results
with similar data). We looked at the number of incorrectly
classified items, the kappa statistic and the area under the
receiver operating characteristic (ROC) curve.
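
To make the baseline and the kappa statistic concrete, here is a small self-contained sketch. It is not the Weka implementation used in the study; the helper names and the toy labels are hypothetical. It shows why a majority-class predictor such as Zero-R always obtains a kappa of 0, whatever its error rate.

```python
from collections import Counter
from typing import List


def zero_r(train_labels: List[str]) -> str:
    """Return the most frequent class; Zero-R predicts it for every item."""
    return Counter(train_labels).most_common(1)[0][0]


def error_rate(y_true: List[str], y_pred: List[str]) -> float:
    """Share of incorrectly classified items."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def cohen_kappa(y_true: List[str], y_pred: List[str]) -> float:
    """Agreement between predictions and truth, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    expected = sum(true_counts[c] * pred_counts[c] for c in classes) / n**2
    if expected == 1.0:
        # Only one class on both sides: kappa is undefined, report 0.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Toy stop-or-again labels (hypothetical, not the study's data).
truth = ["Continue"] * 7 + ["Stop"] * 3
baseline = [zero_r(truth)] * len(truth)
baseline_error = error_rate(truth, baseline)
baseline_kappa = cohen_kappa(truth, baseline)
```

On imbalanced data a baseline can reach a low error rate by always predicting the majority class, so the kappa statistic is the more informative of the two figures when comparing it with a learned model.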

We observed that the ADTree’s performances are sufficiently
good compared to the other classifiers, and that the
kappa statistic is at least fair for the three models. Thus, we
considered the output of the ADTree classifier as consistent
and coherent. For the movement choice model, personality
scores were not selected as a feature in the tree. However, personality
scores were selected for the stop-or-again decision model, and the
classifier was more accurate when personality scores were taken into
account. As an example, Figure 1 shows one branch of the
ADTree for the “stop-or-again” decision. First, the number of
throws in the turn is taken into account. If the player had
already played once during the turn, then the quality of the position
on the board is evaluated; otherwise the decision depends on the
promotion score.
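
For readers unfamiliar with alternating decision trees, the sketch below shows how such a tree is typically evaluated: the prediction values of all nodes reached by an instance are summed, and the sign of the sum gives the class (here negative means continue and positive means stop, as in Figure 1). The feature names come from the figure, but the tree structure, thresholds and prediction values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Prediction:
    """Prediction node: contributes `value` and may host decision nodes."""
    value: float
    splitters: List["Splitter"] = field(default_factory=list)


@dataclass
class Splitter:
    """Decision node: routes to one of two prediction nodes."""
    feature: str
    threshold: float
    below: Prediction        # followed when feature < threshold
    at_or_above: Prediction  # followed when feature >= threshold


def score(node: Prediction, x: Dict[str, float]) -> float:
    """Sum the prediction values along every path reached by x."""
    total = node.value
    for splitter in node.splitters:
        if x[splitter.feature] < splitter.threshold:
            total += score(splitter.below, x)
        else:
            total += score(splitter.at_or_above, x)
    return total


def decide(root: Prediction, x: Dict[str, float]) -> str:
    """Negative score: play again; otherwise stop the turn."""
    return "continue" if score(root, x) < 0 else "stop"


# Hypothetical two-level tree (values are NOT those learned in the study).
toy_tree = Prediction(-0.5, [
    Splitter(
        "game_nbThrowsInThisTurn", 1.5,
        below=Prediction(-1.0, [
            Splitter("scorePro", 4.0, Prediction(0.8), Prediction(-0.4)),
        ]),
        at_or_above=Prediction(0.9),
    ),
])

first_throw_high_promotion = {"game_nbThrowsInThisTurn": 1, "scorePro": 7}
later_throw = {"game_nbThrowsInThisTurn": 3, "scorePro": 7}
```

Because the output is a plain sum of readable rules, each contribution can be inspected individually, which is the interpretability property that motivated the choice of this classifier.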


B. Designing the artificial agents 


In order to test the users’ perception of agents’ personality,
we wanted a control agent (in the scientific sense): an agent
playing without personality. In psychology, there is no such
thing as a person with no personality. So what could it be
in affective computing? Two types of strategies are generally
used as controls in the domain: a random strategy (but does
the absence of planned consistency convey an absence of
personality?) or a traditional “AI” strategy (but is the absence
of implemented personality equivalent to the absence of
personality?).

Thus we considered 4 types of agent:

• the random agent (Rand), which chooses its moves randomly
  and has a 50% probability of stopping its turn;

• the “average” agent (Avg), which follows an ADTree
  created without taking into account personality scores as
  a feature, which should lead to a “depersonalized” strategy;

• the promotion agent (RF-Pro), which has a promotion
  score of 7 and a prevention score of 1;

• the prevention agent (RF-Pre), which has a promotion
  score of 1 and a prevention score of 7.

The so-called RF-agents follow the same ADTree, where some
branches are conditioned by the value of personality scores.


Fig. 1. One branch of the Alternating Decision Tree for the decision model taking personality scores into account
[Tree diagram not reproduced. Decision nodes shown: 1: game_nbThrowsInThisTurn, 2: dec_abs_QualAfter, 3: game_ratioBest3, 5: dec_min%DistAfter, 7: dec_nbPlayableColumns, 8: scorePro. A prediction value of -0.615 is shown; negative values mean continue, positive values mean stop.]


Users easily assign personality to things. With the Rand-agent,
however, the perceived pattern may differ from one user to another
because of the variability in its behaviors, whereas with the
Avg-agent all users are confronted with the same strategy, which
conveys a regulatory focus that is neither promotion nor prevention.


C. Hypotheses 


Knowing the design of the artificial agents, we made 2
hypotheses concerning the results of the user study:


• H1: Differences in agents’ personalities are perceived by
  the human player:

  - H1a: For the rating of personality in the Rand condition,
    the inter-rater agreement is low and the data
    dispersion is high.

  - H1b: For the rating of personality in the Avg condition,
    the inter-rater agreement is good and the data
    dispersion is low. The mean values of promotion and
    prevention scores are around the middle of the scale.

  - H1c: For the rating of personality in the RF conditions,
    the inter-rater agreement is good and the data
    dispersion is low. The mean value of the promotion score
    is in the upper part of the scale for the RF-Pro agent
    (and likewise for the prevention score of
    the RF-Pre agent).


• H2: The credibility of the agent is increased by the
  presence of personality. The RF-agents are perceived as
  more credible than the Rand-agent and the Avg-agent.


D. User study 


1) Experimental design: Prior to the study, each subject 
had answered the RFQ-PF to assess their chronic regulatory 
focus. 

During the study, the following procedure was applied.
First, after signing a consent agreement, the participant
learned the rules of Can’t Stop by viewing a video tutorial³.
The participant could pause the video and go back to watch
any part again. Second,
the participant played a tutorial game against the computer in
order to become familiar with the game itself and its interface.
The participant was informed that the computer would make
random choices during the tutorial game. The experimenter
answered any questions about Can’t Stop. Third, the
participant was informed that he or she would play 4 games
against different artificial agents. The participant was also
informed that he or she would have to evaluate the agent’s
personality after each game.

There was no visual display of the artificial agent: the only
modality available to evaluate the agent’s personality was the way
the agent played the game (see Figure 2). The different conditions
were counterbalanced to compensate for a potential order effect.
After each game against an agent, the participant answered
the RFQ-PF in an other-ratings form (i.e. to characterize the
agent’s strategies during the game), along with 10 questions
from the Godspeed Questionnaire [30]: 5 about likeability and
5 about the perceived intelligence of the agent (5-point Likert
scale). We used these two scales as a credibility measure
since perceived goodness and expertise are key dimensions
of credibility [31]. We used a within-subject design in order
for participants to have a “comparative” perception of agents’
personalities. However, knowing that a game takes 15 to 20
minutes to play, participants only played one game with each
agent to keep the experimental session reasonably short. Given
the number of turns per game (20 on average), we thought
that one game would be sufficient for participants to make a
personality judgment.
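
The text does not say which counterbalancing scheme was used. As one possible illustration only, the sketch below builds a cyclic Latin square over the four conditions and cycles participants through its rows, so that with 20 participants each agent is met 5 times in each position. The function names are hypothetical.

```python
from typing import List

CONDITIONS = ["Rand", "Avg", "RF-Pro", "RF-Pre"]


def latin_square_orders(conditions: List[str]) -> List[List[str]]:
    """Cyclic Latin square: each condition appears once per position."""
    n = len(conditions)
    return [
        [conditions[(row + col) % n] for col in range(n)]
        for row in range(n)
    ]


def assign_orders(n_participants: int, conditions: List[str]) -> List[List[str]]:
    """Give participant p the row p modulo the number of rows."""
    square = latin_square_orders(conditions)
    return [square[p % len(square)] for p in range(n_participants)]


orders = assign_orders(20, CONDITIONS)
```

A cyclic square balances the position of each condition but not which condition precedes which; a fully balanced design would be needed to control first-order carry-over effects as well.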

2) Participants: Twenty participants took part in this evaluation
study. There were 11 men and 9 women (age M = 30,6
years, SD = 8,1). Of the participants, 17 were native French
speakers and 3 were bilingual.


IV. RESULTS⁴ AND DISCUSSION


First, we looked at the mean and standard deviation for each
measure. We also computed the coefficient of quartile variation
(CQV; (Q3 - Q1)/(Q1 + Q3)), which offers a comparable
statistic of dispersion [32], and the Finn coefficient⁵ as an
index of the inter-rater agreement [33]. For the interpretation of
the CQV, the higher the percentage, the more dispersed the data; for
the interpretation of the Finn coefficient, 0 means total disagreement
and 1 means total agreement. Descriptive statistics are
presented in Table II.


³ https://www.youtube.com/watch?v=kQs6JBX0txw
⁴ Data were analysed using R, version 3.1.2, http://www.R-project.org
⁵ R package irr, version 0.84


Fig. 2. A user playing Can’t Stop with an artificial agent during the agent’s
turn; board game as seen by the user (bottom left) and webcam record of the
user’s face (top right)
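
The CQV formula given above can be computed in a few lines. This is a minimal sketch on made-up ratings, not the study's data; the quartile convention is not stated in the text, so the “inclusive” method of Python's `statistics.quantiles` is an assumption.

```python
import statistics
from typing import Sequence


def cqv(values: Sequence[float]) -> float:
    """Coefficient of quartile variation: (Q3 - Q1) / (Q1 + Q3)."""
    # Assumption: "inclusive" quartiles; other conventions give
    # slightly different results on small samples.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / (q1 + q3)


# Hypothetical ratings on a 7-point scale.
spread_out = cqv([1, 2, 3, 4, 5])
unanimous = cqv([3, 3, 3, 3])
```

Because it relies on quartiles rather than on the mean and standard deviation, the CQV is less sensitive to extreme ratings, which is convenient for short Likert-type scales.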

For the analysis of the differences between conditions,
we applied non-parametric statistics, as normality could not
be assumed for these data. We used
the Friedman test as the principal analysis and pairwise
comparisons using the Wilcoxon signed-rank test as post-hoc tests. For
the post-hoc tests, p-values were adjusted using the Holm
correction. Friedman tests reported significant differences for
the promotion score (χ²(3) = 23,44; p < 0,001), the prevention
score (χ²(3) = 23,28; p < 0,001) and the perceived
intelligence (χ²(3) = 15,18; p = 0,002). We did not find
significant differences for the likeability (χ²(3) = 2,24; n.s.).
Post-hoc test results are presented in Table III.
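
The Holm correction applied to the post-hoc p-values is a simple step-down procedure, sketched here in plain Python. The raw p-values in the example are hypothetical, and the function name is illustrative; the statistical tests themselves were run in R by the authors.

```python
from typing import List


def holm_adjust(p_values: List[float]) -> List[float]:
    """Holm step-down adjustment, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


# Three hypothetical raw p-values from pairwise comparisons.
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

With six pairwise comparisons among four agents, the smallest raw p-value is multiplied by 6, the next by 5, and so on, which is less conservative than multiplying all of them by 6 as the Bonferroni correction does.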


A. Dispersion of data and interraters’ agreement 


The CQV of the Rand agent for personality scores is,
on average, 1.5 times higher than the other agents’ CQV. The Finn
coefficients of the Rand agent are the lowest for both the promotion
score and the prevention score (cf. Table II).

These results validate our H1a hypothesis: participants
perceived a personality, but not all participants perceived
the same personality.


B. Personality scores 


For the promotion score, the RF-Pro agent and the Avg-agent
are rated high, unlike the RF-Pre agent and the Rand-agent
(cf. Table II). Moreover, as shown in Table III, post-hoc
tests show that the RF-Pro agent is rated significantly
higher than the RF-Pre agent and the Rand-agent. There is no
significant difference between the RF-Pro agent and the Avg-agent.
Likewise, for the prevention score, the RF-Pre agent and
the Rand-agent are rated high, unlike the RF-Pro agent and the
Avg-agent, as shown in Table II. The RF-Pre agent is rated
significantly higher than the RF-Pro agent and the Avg-agent.
There is no significant difference between the RF-Pre agent
and the Rand-agent (cf. Table III).

Our H1 hypothesis is partially validated: on the one hand,
results show that our RF-Pro and RF-Pre agents have been
respectively perceived as promotion-oriented and prevention-oriented,
as we expected (H1c validated). We could say
that our data-driven strategies successfully convey the agent’s
regulatory focus. On the other hand, our Rand and Avg agents
have been respectively perceived as prevention-oriented and
promotion-oriented (H1b not validated).

This result points out the difficulty of controlling the personality
perception of virtual agents. As brought up by [34], when
personality is assessed with questionnaires, participants may be
driven to rate something they had not perceived in the first
place. Moreover, participants played only one game against
each agent. The absence of repeated interactions prevents
participants from perceiving the consistency in agents’ strategies
(or the lack thereof for the Rand-agent). Nonetheless, the orientation
of users’ perception may be explained by looking at the actual
strategies used by the agents:


• The Rand-agent had a 50% probability of stopping its turn
  each time it had to choose. So, the probability P_i of
  stopping after i choices is P_i = (1/2)^i: short turns are more
  probable than long turns, so users were more likely
  to be exposed to short turns. Short turns could be
  interpreted as a secure strategy, thus it is more probable
  that users rated the Rand-agent as prevention-focus.

• For the Avg-agent, we analyzed the learned tree to
  understand the promotion-focus orientation in users’
  perception. The tree shows a decision node on a feature
  describing the situation on the board, positive if the
  opponent is better placed than the agent on the board, negative
  otherwise. If this feature is greater than 1, a prediction node
  of value -1.61 is traversed (if the output of the tree is
  negative, the agent chooses to play again). So, as soon as
  the opponent has a better situation on the board than
  the agent’s, there is a strong incentive to play again,
  promoting long turns, which may have been interpreted
  as a cue of a promotion-focus personality. Moreover, although
  we do not have the probabilities for each of the 81 possible
  outputs of the tree, only two of them lead the Avg-agent
  to stop its turn.
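
The argument about the Rand-agent’s short turns can be checked with a few lines of arithmetic. The sketch assumes, as the text does, that a turn ends only through the agent’s own 50% stopping decision; turns lost to an unplayable throw are ignored, and the function names are illustrative.

```python
def stop_probability(i: int, p_stop: float = 0.5) -> float:
    """Probability that the agent stops exactly at its i-th choice."""
    # It must continue (i - 1) times, then stop once.
    return (1.0 - p_stop) ** (i - 1) * p_stop


def probability_at_most(k: int, p_stop: float = 0.5) -> float:
    """Probability that the turn ends within the first k choices."""
    return sum(stop_probability(i, p_stop) for i in range(1, k + 1))


one_choice = stop_probability(1)
three_choices = stop_probability(3)
within_two = probability_at_most(2)
```

Under this simplification half of the turns end at the first choice and three quarters within two choices, so long, risk-taking turns are rare for the Rand-agent.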


C. Credibility scores 


For the likeability, scores are around 3 (on a 5-point scale)
for all the agents (cf. Table II). This may reflect a slight
overall positive bias towards artificial agents, due to the person
positivity bias (i.e. a natural tendency of the mind to focus on
the optimistic at a subconscious level) [35]. The fact remains
that we found no differences in likeability. Participants orally
reported difficulties in evaluating likeability, because they found
that the interaction was not sufficient to judge how likeable the
agents were. Game strategies alone might not convey such a
social concept. This result also raises another question: is
self-report sufficient to measure likeability and, if not, which kinds
of user behaviors can be used to measure this concept?

For the perceived intelligence, scores (in Table II) have a
wider range: from 2,63 (Rand-agent) to 3,85 (RF-Pre agent).
The RF-Pre agent is rated significantly higher than the others.
Although the RF-Pro agent was rated as more intelligent than
the Rand and Avg agents, there is no statistically significant
difference between the three agents (cf. Table III). Considering
our number of subjects, we cannot say whether this
non-significance is due to a lack of data or to the agents’
strategies themselves. We noted that the RF-Pre agent
won 47% of its games, compared with 42% for the RF-Pro
agent, 32% for the Avg-agent and only 5% for the
Rand-agent.


TABLE II
DESCRIPTIVE STATISTICS OF THE DIFFERENT SCORES COLLECTED
DURING THE STUDY
Pro Sc. = promotion score (min=1; max=7); Pre Sc. = prevention score
(min=1; max=7); Lik. = likeability (min=1; max=5); Perc. Int. = perceived
intelligence (min=1; max=5); CQV = Coefficient of Quartile Variation

                          Rand    Avg     RF-Pro        RF-Pre
Pro Sc.     Mean          3,53    5,3     5,26          3,09
            SD            1,47    1,44    1,28          1,22
            CQV           32%     15%     14%           20%
            Finn coeff.   0,46    0,48    0,59          0,63
Pre Sc.     Mean          4,51    3,18    2,91          5,58
            SD            1,71    1,4     1,33          0,73
            CQV           30%     37%     25%           8%
            Finn coeff.   0,27    0,51    0,56          0,87
Lik.        Mean          3,3     3,22    3,02          3,51
            SD            0,78    0,67    0,95          0,75
            CQV           19%     13%     18%           14%
Perc. Int.  Mean          2,63    2,94    (illegible)   3,85
            SD            0,6     1,16    0,98          0,68
            CQV           18%     33%     27%           11%


TABLE III
P-VALUES OF THE POST-HOC TESTS FOR THE PROMOTION SCORE, THE
PREVENTION SCORE AND THE PERCEIVED INTELLIGENCE
* p < 0.05; ** p < 0.01; *** p < 0.001

Promotion score
          Rand      Avg        RF-Pro
Avg       0,07      -          -
RF-Pro    0,01*     0,42       -
RF-Pre    0,03*     0,001***   0,001***

Prevention score
          Rand      Avg        RF-Pro
Avg       0,01*     -          -
RF-Pro    0,004**   1,00       -
RF-Pre    0,51      0,005**    0,001***

Perceived intelligence
          Rand      Avg        RF-Pro
Avg       0,63      -          -
RF-Pro    0,42      0,79       -
RF-Pre    0,002**   0,05*      0,02*


V. CONCLUSION AND PERSPECTIVES 


To conclude, we have shown that it is possible to successfully
endow artificial agents with regulatory focus using a
data-driven approach. Moreover, the implemented regulatory
focus can be accurately perceived by users. We believe that
regulatory focus could be of use for affective and persuasive
computing. Indeed, regulatory focus theory comes with the
concept of regulatory fit. Regulatory fit states that a match
between the user’s regulatory focus and the means used to pursue
a goal creates a feeling of rightness about the pursued goal and
increases task engagement [36]. For example, a user in a
state of promotion-focus will be more receptive to promotion-oriented
messages (and likewise for prevention-focus) [37].
For artificial companions (trying to build a relationship with
their users) or persuasive technologies (aiming to change the
behaviors of users through social influence), creating a situation
of regulatory fit could be a good place to start.

As we advocate a personality science that is mutually beneficial
for psychologists and computer scientists, we believe that
producing artificial behaviors that convey a personality
via data-driven techniques is only a first step. The next step
will be to generalize the knowledge available in the learned
trees in order to implement a symbolic goal-based architecture
for Can’t Stop artificial players and see whether we can replicate
our data-driven results. Indeed, we think of data-driven and
theory-driven approaches as complementary: the former will
feed the latter. A theory-driven symbolic implementation
will computationally approach cognitive processes and provide
a clearer view of the parameters accounting for personality in
these processes. By building a bridge between data-driven
results and theory-driven implementation, we hope to provide
valuable and interpretable cognitive simulations that could be
used by psychologists to evaluate and enrich their models.

Besides, we raised methodological concerns about the experimental
testing of concepts such as personality or likeability
in computer science (e.g. how to assess them; how to obtain
a good control condition). As far as personality perception
is concerned, we stress the temporal issue of repetition.
We believe repeated interactions are necessary to go from
impression formation to a deeper personality judgment.

As perspectives, we list directions for future work that
could provide data for answering the questions raised by
our results and help to better understand how personality
functions with artificial agents:

• conducting more longitudinal studies, because only repeated
  interactions could allow users to form a real model of the
  agent’s personality;

• complementing self-report measures with measures of users’
  behaviors, such as engagement;

• investigating whether personality has other impacts on the
  game, using subjective and objective measures (e.g. game
  outcome, user’s strategy);

• implementing an “optimally-playing” agent as another
  control;

• using multi-modality to enhance the interaction, such
  as verbal and non-verbal behaviors during the game, by
  providing a physical representation of a virtual agent.